Sequence Learning and Speech Recognition

Authors

  • Thomas Breuel
  • Martin Krämer
  • Daniel Keysers

Abstract

This paper describes applications of statistical methods, and particularly hidden Markov models, in the domain of sequence learning, after covering the required basics. Natural language processing receives particular attention through a discussion of speech recognition and speech translation. In addition, a network intrusion detection system for detecting complex and coordinated Internet attacks is presented briefly to illustrate the generality of the concept of sequence learning.

1 Fundamentals

This section gives a brief overview of the fundamentals needed to understand the concepts discussed later and outlines the domain of sequence learning. References to further literature are given where appropriate for more detailed background information.

1.1 Notation Overview

Some notational conventions for better comprehension:

  • P denotes probabilities, whereas p is used for probability densities.
  • Calligraphic letters like $\mathcal{V}$ are used for sets.
  • Capital letters like N are used for constant values.

1.2 Statistical Basics

Here we define stochastic processes and Markov chains, a time-restricted stochastic process. Both concepts are essential for understanding the principle of hidden Markov models. Some very elementary statistical principles (Bayes' rule, random variables, probability distributions and the like) are assumed to be familiar.

1.2.1 Stochastic Processes

A stochastic process is a series of random variables $q_1, q_2, \ldots$ which take values $Q_t$ from a discrete or continuous domain according to specific probability distributions. The distribution function of the $Q_t$ may depend on the current state $q_t$ and its predecessors $q_1, q_2, \ldots, q_{t-1}$. The stochastic process is called stationary if its behaviour is independent of the absolute value of time $t$. It is called causal if, additionally, the distribution of $Q_t$ is a function only of the previous states $q_1, q_2, \ldots, q_{t-1}$, i.e. it does not rely on states in the future:

$P(q_t = Q_t \mid q_1 = Q_1, q_2 = Q_2, \ldots, q_{t-1} = Q_{t-1})$
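
The definition of a causal, stationary process can be illustrated with a small simulation. The following is a minimal sketch, not taken from the paper: it assumes a hypothetical two-state process ("silence" and "speech") with made-up transition probabilities, and it uses the simplest causal case, a first-order Markov chain in which the distribution of the next state depends only on the most recent state. The names `STATES`, `TRANSITIONS`, `next_state` and `sample_sequence` are illustrative.

```python
import random

# Hypothetical two-state example; the states and the probabilities are
# illustrative assumptions, not values from the paper.
STATES = ["silence", "speech"]

# TRANSITIONS[a][b] = P(q_t = b | q_{t-1} = a). The table does not depend
# on t, so the process is stationary.
TRANSITIONS = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.2, "speech": 0.8},
}


def next_state(history, rng):
    """Sample the next state given the history q_1, ..., q_{t-1}.

    The function sees only past states (causal). As a first-order Markov
    chain it uses just the last element of the history.
    """
    current = history[-1]
    dist = TRANSITIONS[current]
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]


def sample_sequence(start, length, seed=0):
    """Generate a state sequence of the given length, beginning at `start`."""
    rng = random.Random(seed)
    sequence = [start]
    while len(sequence) < length:
        sequence.append(next_state(sequence, rng))
    return sequence


if __name__ == "__main__":
    print(sample_sequence("silence", 10))
```

Passing the whole history to `next_state` mirrors the general conditional probability above; a process with longer memory could inspect more than the last element without changing the calling code.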

Similar resources

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

In order to facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is one of the considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...

Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model

Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....

Seismic Data Forecasting: A Sequence Prediction or a Sequence Recognition Task

In this paper, we have tried to predict earthquake events in a cluster of seismic data on the Pacific Ring of Fire, using multivariate adaptive regression splines (MARS). The model is employed as either a predictor for a sequence prediction task, or a binary classifier for a sequence recognition problem, which could alternatively help to predict an event. Here, we explain that sequence prediction/r...

Speech Motor Sequence Learning in Adults Who Stutter

Objective Developmental stuttering is a speech disorder characterized by repetition, prolongation, block and disruption of the smooth flow of speech. Environmental, physical, mental, and cognitive-linguistic factors were involved in the initiation and development of stuttering. There have been several theories about the development of stuttering. One of these theories suggests that stuttering i...

Publication date: 2006